crime vs socio-economic factors
Data Visualization Draft
Part A
My two chosen datasets:
https://www.kaggle.com/datasets/mikejohnsonjr/united-states-crime-rates-by-county/data
This dataset contains crime figures per county in the United States. It has columns such as theft, rape, murder, population, and county names. The dataset provides detailed information on different crime types and on population size, which is useful for criminological research and policy-making.
number of instances: 3136
number of attributes: 24
variables:
- county_name (nominal, object, discrete, 0 missing values)
- crime_rate_per_100000 (ratio, float64, continuous, 0 missing values)
- ROBBERY (ratio, int64, discrete, 0 missing values)
- MURDER (ratio, int64, discrete, 0 missing values)
- population (ratio, int64, discrete, 0 missing values)
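The instance count, attribute count, data types, and missing-value counts listed above can be read off a DataFrame with a few pandas calls. This is a minimal sketch, not the notebook's actual code: the name `crime` is an assumption, and a three-row stand-in built from rows shown later in this draft replaces the real CSV.

```python
import pandas as pd

# Stand-in for the crime dataset (assumption: the real file would be
# loaded with pd.read_csv). Values are copied from rows shown in this draft.
crime = pd.DataFrame({
    "county_name": ["St. Louis city, MO", "Crittenden County, AR", "Alexander County, IL"],
    "crime_rate_per_100000": [1791.995377, 1754.914968, 1664.700485],
    "ROBBERY": [1778, 165, 5],
    "population": [318416, 49746, 7629],
})

# Number of instances (rows) and attributes (columns).
n_rows, n_cols = crime.shape

# One row per variable: its data type and how many values are missing.
summary = pd.DataFrame({
    "dtype": crime.dtypes.astype(str),
    "missing": crime.isna().sum(),
})

print(n_rows, n_cols)
print(summary)
```

The measurement level (nominal, ratio) is not stored in the data, so it still has to be assigned by hand per variable.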
Question to explore:
Is there a relationship between the population of a county and the frequency of certain crimes, such as rape and theft?
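One way to approach this question is a Pearson correlation between `population` and the crime counts. The sketch below is illustrative only: it uses the five highest-crime-rate counties shown later in this draft, which is not a representative sample, and it treats `LARCENY` as the column for theft, which is an assumption.

```python
import pandas as pd

# Five rows copied from the crime table in this draft (top of the ranking,
# so not representative of all 3136 counties).
crime = pd.DataFrame({
    "population": [318416, 49746, 7629, 412, 27083],
    "RAPE": [200, 38, 2, 3, 4],
    "LARCENY": [13791, 1753, 184, 4, 494],
})

# Pearson correlation of each raw crime count with population.
count_corr = crime[["RAPE", "LARCENY"]].corrwith(crime["population"])

# Raw counts grow with population almost by construction, so a per-capita
# rate (offences per 100,000 inhabitants) is often the fairer comparison.
rates = crime[["RAPE", "LARCENY"]].div(crime["population"], axis=0) * 100_000
rate_corr = rates.corrwith(crime["population"])

print(count_corr)
print(rate_corr)
```

Comparing the two results separates "bigger counties have more crimes" from "bigger counties have more crimes per inhabitant", which are different claims.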
https://www.kaggle.com/datasets/muonneutrino/us-census-demographic-data
This dataset contains data per county in the United States. It has columns such as ethnic origin (Asian, Hispanic), unemployment, people working from home, total population, and income per person. It mainly shows socio-economic and demographic figures.
number of instances: 3142
number of attributes: 37
variables:
- County (nominal, object, discrete, 0 missing values)
- Employed (ratio, int64, discrete, 0 missing values)
- Men (ratio, int64, discrete, 0 missing values)
- Hispanic (ratio, float64, continuous, 0 missing values)
- TotalPop (ratio, int64, discrete, 0 missing values)
Question to explore:
How does the employment situation differ between ethnic groups, such as Hispanic, Asian, Black, etc., across counties in the United States?
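A first, rough look at this question could correlate each group's population share with the county unemployment rate. The sketch below uses only the five Alabama rows shown later in this draft, so the numbers say nothing about the United States as a whole; treating `Unemployment` as the measure of the employment situation is an assumption.

```python
import pandas as pd

# Five rows copied from the census table in this draft.
# Hispanic, White and Black are percentages of the county population;
# Unemployment is the unemployment rate in percent.
census = pd.DataFrame({
    "County": ["Autauga County", "Baldwin County", "Barbour County",
               "Bibb County", "Blount County"],
    "Hispanic": [2.7, 4.4, 4.2, 2.4, 9.0],
    "White": [75.4, 83.1, 45.7, 74.6, 87.4],
    "Black": [18.9, 9.5, 47.8, 22.0, 1.5],
    "Unemployment": [5.2, 5.5, 12.4, 8.2, 4.9],
})

groups = ["Hispanic", "White", "Black"]

# Correlation between each group's share and the unemployment rate.
share_vs_unemployment = census[groups].corrwith(census["Unemployment"])
print(share_vs_unemployment.sort_values())
```

These are county-level associations. They describe counties, not individuals, so they cannot show by themselves that any group has a higher or lower unemployment rate.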
https://www.openintro.org/data/?data=county_complete
https://www.kaggle.com/code/stefancomanita/american-statistics-visualized-on-maps-w-plotly
● Relevant (based on what you were taught in class) descriptive statistics for the
above chosen 5 variables. Exclude missing values when calculating descriptive
statistics. You do not have to report Kurtosis and Skewness.
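Descriptive statistics that skip missing values can be computed directly, because pandas aggregations ignore NaN by default (`skipna=True`). A minimal sketch, assuming the chosen variables are numeric columns of a DataFrame; the NaN below is inserted on purpose to show the behaviour and is not in the real data.

```python
import numpy as np
import pandas as pd

# Values copied from rows in this draft, with one NaN added for illustration.
df = pd.DataFrame({
    "ROBBERY": [1778, 165, 5, 1, 17],
    "crime_rate_per_100000": [1791.995377, 1754.914968, np.nan,
                              1456.310680, 1447.402430],
})

# count, mean, median, std, min and max all exclude NaN by default.
stats = df.agg(["count", "mean", "median", "std", "min", "max"])

# Range is not built in, so derive it from max and min.
stats.loc["range"] = stats.loc["max"] - stats.loc["min"]

print(stats)
```

The `count` row shows how many non-missing values each statistic was based on, which is worth reporting next to the statistics themselves.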
['crime_data_w_population_and_crime_rate.csv']
| | county_name | crime_rate_per_100000 | index | EDITION | PART | IDNO | CPOPARST | CPOPCRIM | AG_ARRST | AG_OFF | ... | RAPE | ROBBERY | AGASSLT | BURGLRY | LARCENY | MVTHEFT | ARSON | population | FIPS_ST | FIPS_CTY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | St. Louis city, MO | 1791.995377 | 1 | 1 | 4 | 1612 | 318667 | 318667 | 15 | 15 | ... | 200 | 1778 | 3609 | 4995 | 13791 | 3543 | 464 | 318416 | 29 | 510 |
| 1 | Crittenden County, AR | 1754.914968 | 2 | 1 | 4 | 130 | 50717 | 50717 | 4 | 4 | ... | 38 | 165 | 662 | 1482 | 1753 | 189 | 28 | 49746 | 5 | 35 |
| 2 | Alexander County, IL | 1664.700485 | 3 | 1 | 4 | 604 | 8040 | 8040 | 2 | 2 | ... | 2 | 5 | 119 | 82 | 184 | 12 | 2 | 7629 | 17 | 3 |
| 3 | Kenedy County, TX | 1456.310680 | 4 | 1 | 4 | 2681 | 444 | 444 | 1 | 1 | ... | 3 | 1 | 2 | 5 | 4 | 4 | 0 | 412 | 48 | 261 |
| 4 | De Soto Parish, LA | 1447.402430 | 5 | 1 | 4 | 1137 | 26971 | 26971 | 3 | 3 | ... | 4 | 17 | 368 | 149 | 494 | 60 | 0 | 27083 | 22 | 31 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3131 | Ohio County, IN | 0.000000 | 3132 | 1 | 4 | 762 | 6084 | 0 | 2 | 1 | ... | 0 | 0 | 0 | 2 | 2 | 0 | 0 | 5994 | 18 | 115 |
| 3132 | Newton County, MS | 0.000000 | 3133 | 1 | 4 | 1465 | 21545 | 3346 | 3 | 1 | ... | 0 | 0 | 0 | 4 | 0 | 1 | 0 | 21689 | 28 | 101 |
| 3133 | Jerauld County, SD | 0.000000 | 3134 | 1 | 4 | 2424 | 2108 | 2108 | 1 | 1 | ... | 0 | 0 | 0 | 1 | 3 | 1 | 0 | 2066 | 46 | 73 |
| 3134 | Cimarron County, OK | 0.000000 | 3135 | 1 | 4 | 2167 | 2502 | 2502 | 2 | 2 | ... | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 2335 | 40 | 25 |
| 3135 | Lawrence County, MS | 0.000000 | 3136 | 1 | 4 | 1453 | 12714 | 0 | 1 | 1 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 12514 | 28 | 77 |
3136 rows × 24 columns
Second dataset
| | CountyId | State | County | TotalPop | Men | Women | Hispanic | White | Black | Native | ... | Walk | OtherTransp | WorkAtHome | MeanCommute | Employed | PrivateWork | PublicWork | SelfEmployed | FamilyWork | Unemployment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1001 | Alabama | Autauga County | 55036 | 26899 | 28137 | 2.7 | 75.4 | 18.9 | 0.3 | ... | 0.6 | 1.3 | 2.5 | 25.8 | 24112 | 74.1 | 20.2 | 5.6 | 0.1 | 5.2 |
| 1 | 1003 | Alabama | Baldwin County | 203360 | 99527 | 103833 | 4.4 | 83.1 | 9.5 | 0.8 | ... | 0.8 | 1.1 | 5.6 | 27.0 | 89527 | 80.7 | 12.9 | 6.3 | 0.1 | 5.5 |
| 2 | 1005 | Alabama | Barbour County | 26201 | 13976 | 12225 | 4.2 | 45.7 | 47.8 | 0.2 | ... | 2.2 | 1.7 | 1.3 | 23.4 | 8878 | 74.1 | 19.1 | 6.5 | 0.3 | 12.4 |
| 3 | 1007 | Alabama | Bibb County | 22580 | 12251 | 10329 | 2.4 | 74.6 | 22.0 | 0.4 | ... | 0.3 | 1.7 | 1.5 | 30.0 | 8171 | 76.0 | 17.4 | 6.3 | 0.3 | 8.2 |
| 4 | 1009 | Alabama | Blount County | 57667 | 28490 | 29177 | 9.0 | 87.4 | 1.5 | 0.3 | ... | 0.4 | 0.4 | 2.1 | 35.0 | 21380 | 83.9 | 11.9 | 4.0 | 0.1 | 4.9 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3137 | 56037 | Wyoming | Sweetwater County | 44527 | 22981 | 21546 | 16.0 | 79.6 | 0.8 | 0.6 | ... | 2.8 | 1.3 | 1.5 | 20.5 | 22739 | 78.4 | 17.8 | 3.8 | 0.0 | 5.2 |
| 3138 | 56039 | Wyoming | Teton County | 22923 | 12169 | 10754 | 15.0 | 81.5 | 0.5 | 0.3 | ... | 11.7 | 3.8 | 5.7 | 14.3 | 14492 | 82.1 | 11.4 | 6.5 | 0.0 | 1.3 |
| 3139 | 56041 | Wyoming | Uinta County | 20758 | 10593 | 10165 | 9.1 | 87.7 | 0.1 | 0.9 | ... | 1.1 | 1.3 | 2.0 | 19.9 | 9528 | 71.5 | 21.5 | 6.6 | 0.4 | 6.4 |
| 3140 | 56043 | Wyoming | Washakie County | 8253 | 4118 | 4135 | 14.2 | 82.2 | 0.3 | 0.4 | ... | 6.9 | 1.3 | 4.4 | 14.3 | 3833 | 69.8 | 22.0 | 8.1 | 0.2 | 6.1 |
| 3141 | 56045 | Wyoming | Weston County | 7117 | 3756 | 3361 | 1.4 | 91.6 | 0.5 | 0.1 | ... | 3.0 | 1.6 | 6.9 | 25.7 | 3407 | 68.2 | 21.9 | 8.8 | 1.1 | 2.2 |
3142 rows × 37 columns
3136 1877
| | CountyId | State | County | TotalPop | Men | Women | Hispanic | White | Black | Native | ... | OtherTransp | WorkAtHome | MeanCommute | Employed | PrivateWork | PublicWork | SelfEmployed | FamilyWork | Unemployment | fips |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1001 | Alabama | Autauga County | 55036 | 26899 | 28137 | 2.7 | 75.4 | 18.9 | 0.3 | ... | 1.3 | 2.5 | 25.8 | 24112 | 74.1 | 20.2 | 5.6 | 0.1 | 5.2 | 01001 |
| 1 | 1003 | Alabama | Baldwin County | 203360 | 99527 | 103833 | 4.4 | 83.1 | 9.5 | 0.8 | ... | 1.1 | 5.6 | 27.0 | 89527 | 80.7 | 12.9 | 6.3 | 0.1 | 5.5 | 01003 |
| 2 | 1005 | Alabama | Barbour County | 26201 | 13976 | 12225 | 4.2 | 45.7 | 47.8 | 0.2 | ... | 1.7 | 1.3 | 23.4 | 8878 | 74.1 | 19.1 | 6.5 | 0.3 | 12.4 | 01005 |
| 3 | 1007 | Alabama | Bibb County | 22580 | 12251 | 10329 | 2.4 | 74.6 | 22.0 | 0.4 | ... | 1.7 | 1.5 | 30.0 | 8171 | 76.0 | 17.4 | 6.3 | 0.3 | 8.2 | 01007 |
| 4 | 1009 | Alabama | Blount County | 57667 | 28490 | 29177 | 9.0 | 87.4 | 1.5 | 0.3 | ... | 0.4 | 2.1 | 35.0 | 21380 | 83.9 | 11.9 | 4.0 | 0.1 | 4.9 | 01009 |
5 rows × 38 columns
| | county_name | crime_rate_per_100000 | index | EDITION | PART | IDNO | CPOPARST | CPOPCRIM | AG_ARRST | AG_OFF | ... | ROBBERY | AGASSLT | BURGLRY | LARCENY | MVTHEFT | ARSON | population | FIPS_ST | FIPS_CTY | fips |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | St. Louis city, MO | 1791.995377 | 1 | 1 | 4 | 1612 | 318667 | 318667 | 15 | 15 | ... | 1778 | 3609 | 4995 | 13791 | 3543 | 464 | 318416 | 29 | 510 | 29510 |
| 1 | Crittenden County, AR | 1754.914968 | 2 | 1 | 4 | 130 | 50717 | 50717 | 4 | 4 | ... | 165 | 662 | 1482 | 1753 | 189 | 28 | 49746 | 5 | 35 | 05035 |
| 2 | Alexander County, IL | 1664.700485 | 3 | 1 | 4 | 604 | 8040 | 8040 | 2 | 2 | ... | 5 | 119 | 82 | 184 | 12 | 2 | 7629 | 17 | 3 | 17003 |
| 3 | Kenedy County, TX | 1456.310680 | 4 | 1 | 4 | 2681 | 444 | 444 | 1 | 1 | ... | 1 | 2 | 5 | 4 | 4 | 0 | 412 | 48 | 261 | 48261 |
| 4 | De Soto Parish, LA | 1447.402430 | 5 | 1 | 4 | 1137 | 26971 | 26971 | 3 | 3 | ... | 17 | 368 | 149 | 494 | 60 | 0 | 27083 | 22 | 31 | 22031 |
5 rows × 25 columns
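The `fips` column in the two previews above is a five-character, zero-padded county code. One way to build it, consistent with the values shown (29 and 510 give 29510, 5 and 35 give 05035, CountyId 1001 gives 01001), is sketched below; the exact code used in the notebook is not shown in this draft, so this is an assumption about how it was done.

```python
import pandas as pd

# Rows copied from the two tables in this draft.
crime = pd.DataFrame({
    "county_name": ["St. Louis city, MO", "Crittenden County, AR", "Alexander County, IL"],
    "FIPS_ST": [29, 5, 17],
    "FIPS_CTY": [510, 35, 3],
})
census = pd.DataFrame({
    "CountyId": [1001, 1003],
    "County": ["Autauga County", "Baldwin County"],
})

# Crime data: 2-digit state code followed by 3-digit county code.
crime["fips"] = (
    crime["FIPS_ST"].astype(str).str.zfill(2)
    + crime["FIPS_CTY"].astype(str).str.zfill(3)
)

# Census data: CountyId already combines both parts but lost its leading zero.
census["fips"] = census["CountyId"].astype(str).str.zfill(5)

print(crime[["county_name", "fips"]])
print(census[["County", "fips"]])
```

Keeping the key as a string in both tables matters: as integers, 01001 and 1001 would compare equal, but a string key in one table and an integer key in the other would not match.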
| | CountyId | State | County | TotalPop | Men | Women | Hispanic | White | Black | Native | ... | RAPE | ROBBERY | AGASSLT | BURGLRY | LARCENY | MVTHEFT | ARSON | population | FIPS_ST | FIPS_CTY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1001 | Alabama | Autauga County | 55036 | 26899 | 28137 | 2.7 | 75.4 | 18.9 | 0.3 | ... | 15.0 | 34.0 | 87.0 | 447.0 | 1233.0 | 85.0 | 108.0 | 55246.0 | 1.0 | 1.0 |
| 1 | 1003 | Alabama | Baldwin County | 203360 | 99527 | 103833 | 4.4 | 83.1 | 9.5 | 0.8 | ... | 30.0 | 76.0 | 332.0 | 967.0 | 3829.0 | 192.0 | 31.0 | 195540.0 | 1.0 | 3.0 |
| 2 | 1005 | Alabama | Barbour County | 26201 | 13976 | 12225 | 4.2 | 45.7 | 47.8 | 0.2 | ... | 4.0 | 8.0 | 36.0 | 90.0 | 362.0 | 21.0 | 0.0 | 27076.0 | 1.0 | 5.0 |
| 3 | 1007 | Alabama | Bibb County | 22580 | 12251 | 10329 | 2.4 | 74.6 | 22.0 | 0.4 | ... | 4.0 | 8.0 | 36.0 | 122.0 | 251.0 | 27.0 | 0.0 | 22512.0 | 1.0 | 7.0 |
| 4 | 1009 | Alabama | Blount County | 57667 | 28490 | 29177 | 9.0 | 87.4 | 1.5 | 0.3 | ... | 11.0 | 9.0 | 101.0 | 397.0 | 865.0 | 86.0 | 9.0 | 57872.0 | 1.0 | 9.0 |
5 rows × 62 columns
(3142, 62)
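The shape (3142, 62) is what a left join of the census table (38 columns) with the crime table (25 columns) on the shared `fips` key would give: 38 + 25 - 1 = 62 columns and one row per census county. The sketch below illustrates that kind of join on a tiny example; that the notebook used exactly this call is an assumption, and Barbour County is deliberately left out of the toy crime table to produce an unmatched row.

```python
import pandas as pd

census = pd.DataFrame({
    "fips": ["01001", "01003", "01005"],
    "County": ["Autauga County", "Baldwin County", "Barbour County"],
    "TotalPop": [55036, 203360, 26201],
})
# Toy crime table: Barbour County (01005) is omitted on purpose.
crime = pd.DataFrame({
    "fips": ["01001", "01003"],
    "ROBBERY": [34, 76],
    "population": [55246, 195540],
})

# how="left" keeps every census county; indicator=True adds a "_merge"
# column saying whether a matching crime row was found.
merged = census.merge(crime, on="fips", how="left", indicator=True)

# Counties without crime data get NaN in all crime columns.
unmatched = merged[merged["_merge"] == "left_only"]

print(merged.shape)
print(merged.isna().sum())
print(unmatched[["fips", "County"]])
```

Unmatched rows are also why the crime columns appear as floats (for example 34.0) after the join: integer columns are converted to float so they can hold NaN.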
----------------------------------------
CountyId 0
State 0
County 0
TotalPop 0
Men 0
..
MVTHEFT 9
ARSON 9
population 9
FIPS_ST 9
FIPS_CTY 9
Length: 62, dtype: int64
--------------------------------------------------------------------------------
Index(['CountyId', 'State', 'County', 'TotalPop', 'Men', 'Women', 'Hispanic',
'White', 'Black', 'Native', 'Asian', 'Pacific', 'VotingAgeCitizen',
'Income', 'IncomeErr', 'IncomePerCap', 'IncomePerCapErr', 'Poverty',
'ChildPoverty', 'Professional', 'Service', 'Office', 'Construction',
'Production', 'Drive', 'Carpool', 'Transit', 'Walk', 'OtherTransp',
'WorkAtHome', 'MeanCommute', 'Employed', 'PrivateWork', 'PublicWork',
'SelfEmployed', 'FamilyWork', 'Unemployment', 'fips', 'county_name',
'crime_rate_per_100000', 'index', 'EDITION', 'PART', 'IDNO', 'CPOPARST',
'CPOPCRIM', 'AG_ARRST', 'AG_OFF', 'COVIND', 'INDEX', 'MODINDX',
'MURDER', 'RAPE', 'ROBBERY', 'AGASSLT', 'BURGLRY', 'LARCENY', 'MVTHEFT',
'ARSON', 'population', 'FIPS_ST', 'FIPS_CTY'],
dtype='object')
0.0 1791.995377
6218279 59
10105722 74
28.8 0.0
52.0 2.4
69529 9334
Correlation between income and the crime rate: -0.14
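A coefficient like the one above can be computed with `Series.corr`, which gives the Pearson correlation by default. The sketch below uses invented toy numbers, not the real datasets, and assumes `Income` is the income column that was used (the draft does not say whether it was `Income` or `IncomePerCap`).

```python
import numpy as np
import pandas as pd

# Toy values invented for illustration only; they are NOT from the datasets.
# The NaN mimics a county that had no match in the crime table.
merged = pd.DataFrame({
    "Income": [30000, 40000, 50000, 60000, 45000],
    "crime_rate_per_100000": [500.0, 300.0, 350.0, 200.0, np.nan],
})

# Keep only counties where both values are present.
pairs = merged[["Income", "crime_rate_per_100000"]].dropna()

# Pearson correlation coefficient, between -1 and 1.
r = pairs["Income"].corr(pairs["crime_rate_per_100000"])

print(f"Correlation between income and crime rate: {r:.2f}")
print(f"Share of variance explained (r squared): {r ** 2:.2f}")
```

For the reported value of -0.14, r squared is about 0.02, so a linear relationship with income would account for roughly 2% of the variation in the crime rate: a weak negative association.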
Part B
| | country | total_cases | total_deaths | total_vaccinations | population |
|---|---|---|---|---|---|
| 0 | Afghanistan | 230375.0 | 7973.0 | 2.296475e+07 | 4.112877e+07 |
| 1 | Africa | 13133432.0 | 259066.0 | 8.632379e+08 | 1.426737e+09 |
| 2 | Albania | 334596.0 | 3604.0 | 3.088966e+06 | 2.842318e+06 |
| 3 | Algeria | 272010.0 | 6881.0 | NaN | 4.490323e+07 |
| 4 | American Samoa | 8359.0 | 34.0 | NaN | 4.429500e+04 |
| ... | ... | ... | ... | ... | ... |
| 248 | Wallis and Futuna | 3550.0 | 8.0 | 1.805800e+04 | 1.159600e+04 |
| 249 | World | 773948532.0 | 7015947.0 | 1.357576e+10 | 7.975105e+09 |
| 250 | Yemen | 11945.0 | 2159.0 | 1.298654e+06 | 3.369661e+07 |
| 251 | Zambia | 349304.0 | 4069.0 | 1.345421e+07 | 2.001767e+07 |
| 252 | Zimbabwe | 266071.0 | 5731.0 | NaN | 1.632054e+07 |
253 rows × 5 columns
| | county_name | crime_rate_per_100000 | index | EDITION | PART | IDNO | CPOPARST | CPOPCRIM | AG_ARRST | AG_OFF | ... | RAPE | ROBBERY | AGASSLT | BURGLRY | LARCENY | MVTHEFT | ARSON | population | FIPS_ST | FIPS_CTY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | St. Louis city, MO | 1791.995377 | 1 | 1 | 4 | 1612 | 318667 | 318667 | 15 | 15 | ... | 200 | 1778 | 3609 | 4995 | 13791 | 3543 | 464 | 318416 | 29 | 510 |
| 1 | Crittenden County, AR | 1754.914968 | 2 | 1 | 4 | 130 | 50717 | 50717 | 4 | 4 | ... | 38 | 165 | 662 | 1482 | 1753 | 189 | 28 | 49746 | 5 | 35 |
| 2 | Alexander County, IL | 1664.700485 | 3 | 1 | 4 | 604 | 8040 | 8040 | 2 | 2 | ... | 2 | 5 | 119 | 82 | 184 | 12 | 2 | 7629 | 17 | 3 |
| 3 | Kenedy County, TX | 1456.310680 | 4 | 1 | 4 | 2681 | 444 | 444 | 1 | 1 | ... | 3 | 1 | 2 | 5 | 4 | 4 | 0 | 412 | 48 | 261 |
| 4 | De Soto Parish, LA | 1447.402430 | 5 | 1 | 4 | 1137 | 26971 | 26971 | 3 | 3 | ... | 4 | 17 | 368 | 149 | 494 | 60 | 0 | 27083 | 22 | 31 |
5 rows × 24 columns
county_name 0
crime_rate_per_100000 0
index 0
EDITION 0
PART 0
IDNO 0
CPOPARST 0
CPOPCRIM 0
AG_ARRST 0
AG_OFF 0
COVIND 0
INDEX 0
MODINDX 0
MURDER 0
RAPE 0
ROBBERY 0
AGASSLT 0
BURGLRY 0
LARCENY 0
MVTHEFT 0
ARSON 0
population 0
FIPS_ST 0
FIPS_CTY 0
dtype: int64
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/var/folders/cz/xk86tnc93sg47gkrn84h27xm0000gn/T/ipykernel_31215/2845538389.py in ?()
----> 1 df3 = covid_df_aggregated.merge(df2, left_on='country', right_on='Country/Territory', how='inner')
      2 display(df3.head())

/opt/homebrew/Caskroom/miniconda/base/envs/myenv/lib/python3.12/site-packages/pandas/core/frame.py in ?(self, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
  10828         validate: MergeValidate | None = None,
  10829     ) -> DataFrame:
  10830         from pandas.core.reshape.merge import merge
  10831
> 10832         return merge(
  10833             self,
  10834             right,
  10835             how=how,

/opt/homebrew/Caskroom/miniconda/base/envs/myenv/lib/python3.12/site-packages/pandas/core/reshape/merge.py in ?(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
    166             validate=validate,
    167             copy=copy,
    168         )
    169     else:
--> 170         op = _MergeOperation(
    171             left_df,
    172             right_df,
    173             how=how,

/opt/homebrew/Caskroom/miniconda/base/envs/myenv/lib/python3.12/site-packages/pandas/core/reshape/merge.py in ?(self, left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, indicator, validate)
    790             self.right_join_keys,
    791             self.join_names,
    792             left_drop,
    793             right_drop,
--> 794         ) = self._get_merge_keys()
    795
    796         if left_drop:
    797             self.left = self.left._drop_labels_or_levels(left_drop)

/opt/homebrew/Caskroom/miniconda/base/envs/myenv/lib/python3.12/site-packages/pandas/core/reshape/merge.py in ?(self)
   1293                     # Then we're either Hashable or a wrong-length arraylike,
   1294                     #  the latter of which will raise
   1295                     rk = cast(Hashable, rk)
   1296                     if rk is not None:
-> 1297                         right_keys.append(right._get_label_or_level_values(rk))
   1298                     else:
   1299                         # work-around for merge_asof(right_index=True)
   1300                         right_keys.append(right.index._values)

/opt/homebrew/Caskroom/miniconda/base/envs/myenv/lib/python3.12/site-packages/pandas/core/generic.py in ?(self, key, axis)
   1907             values = self.xs(key, axis=other_axes[0])._values
   1908         elif self._is_level_reference(key, axis=axis):
   1909             values = self.axes[axis].get_level_values(key)._values
   1910         else:
-> 1911             raise KeyError(key)
   1912
   1913         # Check for duplicates
   1914         if values.ndim > 1:

KeyError: 'Country/Territory'
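The KeyError means the right-hand table passed to `merge` has no column named 'Country/Territory'. Judging from the preview printed just before the error, `df2` still seems to hold the county crime table at that point, but that is an inference from this draft. A small guard that checks both key columns before merging makes this kind of mistake easier to spot. The helper `safe_merge` and the `country_info` table below are hypothetical names introduced for illustration.

```python
import pandas as pd

def safe_merge(left, right, left_on, right_on, how="inner"):
    """Merge two DataFrames after confirming both key columns exist.

    Hypothetical helper: raises a KeyError that lists the available
    columns, which is easier to read than the pandas traceback.
    """
    if left_on not in left.columns:
        raise KeyError(f"{left_on!r} not in left table; columns: {list(left.columns)}")
    if right_on not in right.columns:
        raise KeyError(f"{right_on!r} not in right table; columns: {list(right.columns)}")
    return left.merge(right, left_on=left_on, right_on=right_on, how=how)

# Two rows copied from the COVID table in this draft.
covid = pd.DataFrame({
    "country": ["Afghanistan", "Albania"],
    "total_cases": [230375.0, 334596.0],
})

# Hypothetical country-level table that does have the expected key column.
country_info = pd.DataFrame({
    "Country/Territory": ["Afghanistan", "Albania"],
    "note": ["example row", "example row"],
})

df3 = safe_merge(covid, country_info, "country", "Country/Territory")
print(df3)
```

Calling `safe_merge` with the crime table as the right-hand side would fail immediately with a message listing that table's columns (county_name, crime_rate_per_100000, ...), pointing to the wrong DataFrame rather than to a line inside pandas.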